Experiments on Routing, Filtering and Chinese Text Retrieval in TREC-5

Authors

  • Chong-Wah Ngo
  • Kok F. Lai
Abstract

We describe our experiments in routing, filtering and Chinese text retrieval. We based our routing and filtering experiments on our discriminant projection algorithm. The algorithm sequentially constructs a series of orthogonal axes from the training documents using the Gram-Schmidt procedure. It then rotates the resulting subspace using principal component analysis so that the axes are ordered by their importance. For Chinese text retrieval, we experimented with both an automatic method and a manual method. For the automatic method, we use all phrases in the description field and compute the aggregate scores using the simple tf-idf formula. For the manual method, we manually construct Boolean phrase queries which are thought to improve the results.
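
The axis-construction step described in the abstract can be illustrated with a minimal sketch of the Gram-Schmidt procedure. This is not the authors' implementation: the abstract does not say how training documents are represented or selected, so the term-weight vectors and the `gram_schmidt` helper below are assumptions for illustration, and the subsequent principal component rotation is omitted.

```python
import math


def gram_schmidt(vectors, tol=1e-10):
    """Build orthonormal axes one vector at a time (modified Gram-Schmidt).

    Each input vector has its components along the existing axes removed;
    whatever remains, if not numerically zero, becomes a new unit-length axis.
    """
    axes = []
    for v in vectors:
        residual = list(v)
        for axis in axes:
            # Subtract the projection of the residual onto this axis.
            proj = sum(r * a for r, a in zip(residual, axis))
            residual = [r - proj * a for r, a in zip(residual, axis)]
        norm = math.sqrt(sum(r * r for r in residual))
        if norm > tol:
            axes.append([r / norm for r in residual])
    return axes


# Hypothetical term-weight vectors for three training documents.
docs = [
    [1.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [2.0, 1.0, 1.0],  # sum of the first two, so it contributes no new axis
]

axes = gram_schmidt(docs)
print(len(axes))
```

In this toy input the third document lies in the span of the first two, so only two axes are produced. Ordering the axes by importance, as the abstract describes, would require an additional principal component analysis step over the documents projected into this subspace.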

Similar resources

Term importance, Boolean conjunct training, negative terms, and foreign language retrieval: probabilistic algorithms at TREC-5

The Berkeley experiments for TREC-5 extend those of TREC-4 in numerous ways. For routing retrieval we experimented with the idea of term importance in three ways: training on Boolean conjuncts of the most important terms, filtering with the most important terms, and, finally, logistic regression on presence or absence of those terms. For ad-hoc retrieval we retained the manual reformulations of...

PLIERS at TREC8

The use of the PLIERS text retrieval system in TREC8 experiments is described. The tracks entered are: Ad-Hoc, Filtering (Batch and Routing) and the Web Track (Large only). We describe both retrieval efficiency and effectiveness results for all these tracks. We also describe some preliminary experiments with BM_25 tuning constant variation.

TREC 11 Experiments at CAS-ICT: Filtering and Web

CAS-ICT took part in the TREC conference for the second time this year and we undertook two tracks of TREC-11. For the filtering track, we submitted results for all three subtasks. In adaptive filtering, we paid more attention to the processing of undetermined documents, profile building and adaptation. In batch filtering and routing, a centroid-based classifier is used with preprocessed samples. For ...

JHU/APL at TREC 2001: Experiments in Filtering and in Arabic, Video, and Web Retrieval

The outsider might wonder whether, in its tenth year, the Text Retrieval Conference would be a moribund workshop encouraging little innovation and undertaking few new challenges, or whether fresh research problems would continue to be addressed. We feel strongly that it is the latter that is true; our group at the Johns Hopkins University Applied Physics Laboratory (JHU/APL) participated in four...

TREC-9 Cross-Language Information Retrieval (English-Chinese) Overview

(English-Chinese) Overview. Fredric Gey and Aitao Chen, UC DATA and SIMS, University of California, Berkeley. E-mail: gey@ucdata.berkeley.edu, aitao@sims.berkeley.edu. Abstract: Sixteen groups participated in the TREC-9 cross-language information retrieval track which focussed on retrieving Chinese language documents in response to 25 English queries. A variety of CLIR approaches were tested and a rich ...

Publication date: 1996